September 19, 2011

This file explains how to replicate the results in Bold, Kimenyi, Mwabu, and Sandefur, "Why did abolishing fees not increase public school enrollment in Kenya?"  

Data:

The paper relies on two sources of data: 

(i) Household survey data from the 1997 WMS and 2006 KIHBS, conducted by the Kenya National Bureau of Statistics (KNBS).  While the KNBS terms and conditions explicitly prohibit us from distributing the raw data directly, both data sets are publicly available through the KNBS web site.  Further details can be found here (http://www.knbs.or.ke/surveys.php), and a data request form here (http://www.knbs.or.ke/fillforms.php).

(ii) Standardized test scores from the KCPE exam.  The paper uses district average KCPE scores from public and private schools, respectively, as regressors in the enrolment choice model.  These scores are included in the kenya_fpe_replication.zip archive, in the file labeled kcpe_scores_110914.dta.  No other test score data is required to reproduce the results in the paper.  (NB: These district averages were calculated from school-level test scores using data files provided to us by the Ministry of Education.  We have included the Stata program "1 compute ave test scores by district.do" which documents this computation, and are happy to share the raw, school level scores with written permission from the Kenya National Examination Council.  Permission can be sought by writing to the Council Secretary, Dr. Paul Wassanga, Kenya National Examinations Council, P.O. Box: 73598 00200, Nairobi, Kenya, Tel:+254 020 246919 / 020-247204. Fax: +254-020- 2226032)

Programs:

The majority of the calculations were done in Stata, version 10.  The "FPE replication materials.zip" archive contains Stata ".do" files that begin with the raw household survey and test-score data and reproduce the results in the paper.  Run the programs in the order given by the number at the start of each file name; files that share a number can be run in any order relative to one another.

"1 align WMS and KIHBS hh surveys.do", creates a repeated cross-section of enrolment data pre-/post-FPE.
"1 compute ave KCPE scores by district.do", aggregates school-level test scores to create the district averages included as kcpe_scores_110914.dta.
"2 prep data for clogit.do", cleans and reshapes survey data for the conditional logit model of enrolment choice.
"3 compute enrolment rates.do", estimates gross and net enrolment rates shown in Table 1 and Figure 1.
"3 run expenditure regs.do", estimates regressions reported in Table 2.
"4 run clogit - prim.do", computes coefficients and standard errors reported in Table 3 (col 1) and Table 4.
"4 run clogit - sec.do", computes coefficients and standard errors reported in Table 3 (col 2).
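
The run order above can be sketched as a single master do-file.  This is only an illustration: the master file itself and the folder path are hypothetical and not part of the archive, and the file names are copied from the list above.

```stata
* Hypothetical master do-file; the path below is a placeholder for
* the folder where the replication programs were unzipped.
cd "C:/replication/programs"

* Step 1 (no required order between the two files).
do "1 align WMS and KIHBS hh surveys.do"
* The next file needs the raw school-level scores, which are not in
* the archive; leave it commented out when using the supplied
* kcpe_scores_110914.dta.
* do "1 compute ave KCPE scores by district.do"

* Step 2.
do "2 prep data for clogit.do"

* Step 3 (no required order between the two files).
do "3 compute enrolment rates.do"
do "3 run expenditure regs.do"

* Step 4 (no required order between the two files).
do "4 run clogit - prim.do"
do "4 run clogit - sec.do"
```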

Two additional programs, "deltamethod.ado" and "inputcoeffs.ado", are called by the other programs and should be placed in a directory on Stata's ado-file search path (for example, the PERSONAL directory reported by Stata's "sysdir" command).
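
As a minimal sketch of one way to make the two ".ado" files visible to Stata, the commands below list Stata's system directories, add a folder to the ado-file search path for the current session, and check that both programs can be found.  The folder path is a hypothetical placeholder; copying the files into the PERSONAL directory instead works equally well.

```stata
* Show Stata's system directories, including PERSONAL.
sysdir

* Alternatively, add the folder holding the two .ado files to the
* search path for this session (hypothetical placeholder path).
adopath + "C:/replication/ado"

* Confirm that Stata can locate both programs.
which deltamethod
which inputcoeffs
```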

Tables 3 and 4 also rely on intermediate calculations done in Matlab.  These can be found in the following files:
- "Bayer-Timmins rich poor.m", which calls
- "ConsistentExpectationsRP.m".
Together, these two programs provide only a starting value for the Bayer-Timmins estimation in Stata.

The comments in the Stata program "4 run clogit - prim.do" provide details on the interface between the Stata and Matlab results.  For convenience, we have manually entered the Matlab results into the Stata program so that users who wish to skip the intermediate Matlab step can still check all other steps in the estimation.

NB: The Stata programs are designed with the following folder structure in mind:
    - An input folder where all raw data files are located.  These files are used but never altered.  
    - An output folder where all new data files will be saved.
You should redefine the global macros "input" and "output" in the Stata program "1 align WMS and KIHBS hh surveys.do" to correspond to the location of these folders on your machine.
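
For illustration, redefining the two macros might look like the lines below.  The folder paths are hypothetical placeholders, and the "use" line is only an assumed example of how a global macro is referenced; check the do-file itself for whether it expects a trailing slash on these paths.

```stata
* Hypothetical placeholder paths; replace with your own folders.
global input  "C:/replication/input"
global output "C:/replication/output"

* Assumed example of referencing a global macro in a file path:
* use "$input/kcpe_scores_110914.dta", clear
```

Forward slashes in paths work in Stata on all platforms, including Windows.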
    